Applying CNL Authoring Support to Improve Machine Translation of Forum Data

Authors

  • Sabine Lehmann
  • Ben Gottesman
  • Robert Grabowski
  • Mayo Kudo
  • Siu Kei Pepe Lo
  • Melanie Siegel
  • Frederik Fouvry
Abstract

Machine translation (MT) is most often used for texts of publishable quality. However, there is increasing interest in providing translations of user-generated content in customer forums. This paper describes research towards addressing this challenge by automatically improving the quality of community forum data to improve MT results.

Similar resources

Rabbit to OWL: Ontology Authoring with a CNL-Based Tool

There is a recent trend of using controlled natural language (CNL) interfaces to provide more intuitive ways for entering abstract knowledge constructs [3]. This can reduce the complexity of knowledge formulation, which can lead to wider user involvement in ontology authoring and improved efficiency of the knowledge engineering process. However, CNL-based tools for ontology engineering focus so...


Intuitive ontology authoring using controlled natural language

Ontologies have been proposed and studied in the last couple of decades as a way to capture and share people’s knowledge about the world in a way that is processable by computer systems. Ontologies have the potential to serve as a bridge between the human conceptual understanding of the world and the data produced, processed and stored in computer systems. However, ontologies so far have failed...


MuTUAL: A Controlled Authoring Support System Enabling Contextual Machine Translation

The paper introduces a web-based authoring support system, MuTUAL, which aims to help writers create multilingual texts. The highlighted feature of the system is that it enables machine translation (MT) to generate outputs appropriate to their functional context within the target document. Our system is operational online, implementing core mechanisms for document structuring and controlled wri...


Implementing Controlled Languages in GF

The paper introduces GF, Grammatical Framework, as a tool for implementing controlled languages. GF provides a high-level grammar formalism and a resource grammar library that make it easy to write grammars that cover similar fragments in several natural languages at the same time. Authoring help tools and automatic translation are provided for all grammars. As an example, a grammar of Attempto...


A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian also has multi-part words. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...




Publication date: 2012